Semi-Supervised Active Learning for Sequence Labeling
Abstract
While Active Learning (AL) has already been shown to markedly reduce the annotation effort for many sequence labeling tasks compared to random selection, AL remains unconcerned about the internal structure of the selected sequences (typically, sentences). We propose a semi-supervised AL approach for sequence labeling in which only highly uncertain subsequences are presented to human annotators, while all other subsequences in the selected sequences are labeled automatically. For the task of entity recognition, our experiments reveal that this approach reduces annotation effort, measured in manually labeled tokens, by up to 60% compared to the standard, fully supervised AL scheme.
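The split between manually and automatically labeled subsequences can be illustrated with a minimal sketch. The abstract does not say how uncertainty is measured or where the cut-off lies, so the per-token confidence scores, the 0.9 threshold, and the function names below are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Optional, Tuple


def uncertain_spans(confidence: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Group contiguous low-confidence tokens into half-open spans [start, end)."""
    spans: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, score in enumerate(confidence):
        if score < threshold:
            if start is None:
                start = i  # a new uncertain subsequence begins here
        elif start is not None:
            spans.append((start, i))  # close the running subsequence
            start = None
    if start is not None:
        spans.append((start, len(confidence)))
    return spans


def split_labels(
    predicted: List[str],
    confidence: List[float],
    threshold: float = 0.9,  # hypothetical cut-off, not taken from the paper
) -> Tuple[List[Optional[str]], List[Tuple[int, int]]]:
    """Keep the model's label where it is confident; mark the rest for a human.

    Returns (labels, spans): labels[i] is None for tokens that fall inside an
    uncertain span and therefore still need a manual label.
    """
    if len(predicted) != len(confidence):
        raise ValueError("predicted and confidence must have the same length")
    spans = uncertain_spans(confidence, threshold)
    labels: List[Optional[str]] = list(predicted)
    for start, end in spans:
        for i in range(start, end):
            labels[i] = None
    return labels, spans


def manual_fraction(spans: List[Tuple[int, int]], n_tokens: int) -> float:
    """Share of tokens in the sequence that a human has to label."""
    if n_tokens == 0:
        return 0.0
    return sum(end - start for start, end in spans) / n_tokens


if __name__ == "__main__":
    # Made-up model output for a five-token sentence.
    predicted = ["B-PER", "I-PER", "O", "B-LOC", "O"]
    confidence = [0.97, 0.62, 0.58, 0.99, 0.55]
    labels, spans = split_labels(predicted, confidence)
    print(labels, spans, manual_fraction(spans, len(predicted)))
```

In this sketch a selected sentence is never discarded: confident tokens keep the model's label, and only the tokens inside the returned spans would be shown to an annotator. Counting those tokens is one way to measure effort in manually labeled tokens, the unit the abstract reports.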
Related resources
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
A data stream is a sequence of data generated from various information sources at high speed and in high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, the stream is commonly divided into fixed-size windows or gradual forgetting is used. Concept drift refer...
Combining active and semi-supervised learning for spoken language understanding
In this paper, we describe active and semi-supervised learning methods for reducing the labeling effort for spoken language understanding. In a goal-oriented call routing system, understanding the intent of the user can be framed as a classification problem. State-of-the-art statistical classification systems are trained using a large number of human-labeled utterances, preparation of which is ...
Deep Semi-Supervised Learning with Linguistically Motivated Sequence Labeling Task Hierarchies
In this paper we present a novel neural network algorithm for conducting semi-supervised learning for sequence labeling tasks arranged in a linguistically motivated hierarchy. This relationship is exploited to regularise the representations of supervised tasks by backpropagating the error of the unsupervised task through the supervised tasks. We introduce a neural network where lower layers are ...
Deterministic Annealing for Semi-Supervised Structured Output Learning
In this paper we propose a new approach for semi-supervised structured output learning. Our approach uses relaxed labeling on unlabeled data to deal with the combinatorial nature of the label space and further uses domain constraints to guide the learning. Since the overall objective is non-convex, we alternate between the optimization of the model parameters and the label distribution of unlab...
Combining Active Learning and Semi-supervised Learning Using Local and Global Consistency
Semi-supervised learning and active learning are important techniques for addressing the shortage of labeled examples. In this paper, a novel active learning algorithm combining semi-supervised Learning with Local and Global Consistency (LLGC) is proposed. It selects for labeling the example that minimizes the estimated expected classification risk. Then, a better classifier can be trained with la...